{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "3aae4fde",
   "metadata": {},
   "source": [
    "\n",
    "# Exemplar Project: Meta-Analysis of Survival Data in Clinical Research\n",
    "\n",
    "This notebook implements a meta-analysis project focused on Tyrosine Kinase Inhibitors (TKIs) versus traditional chemotherapy for advanced Non-Small Cell Lung Cancer (NSCLC).\n",
    "\n",
    "## Topics Covered\n",
    "- DerSimonian and Laird inverse variance method\n",
    "- Forest plots for treatment comparisons\n",
    "- Publication bias assessment with funnel plots\n",
    "- Mantel-Haenszel estimator for small meta-analyses\n",
    "\n",
    "### Dataset Information\n",
    "The dataset includes four Randomized Controlled Trials (RCTs) and their event data:\n",
    "- Events in experimental (TKI) and control (chemo) groups\n",
    "- Risk ratios for Progression-Free Survival (PFS)\n",
    "\n",
    "### Prerequisites\n",
    "Install the required package before running the code:\n",
    "```bash\n",
    "pip install PythonMeta matplotlib numpy pandas\n",
    "```\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "864863fc",
   "metadata": {},
   "source": [
    "\n",
    "## Meta-Analysis Dataset\n",
    "\n",
    "The event data for the four RCTs is as follows:\n",
    "| Study               | Events (TKI) | Subjects (TKI) | Events (Chemo) | Subjects (Chemo) |\n",
    "|---------------------|--------------|----------------|----------------|------------------|\n",
    "| Shi Y et al (2017)  | 91           | 138            | 106            | 122              |\n",
    "| Fukuoka M et al (2011)| 76         | 96             | 90             | 94               |\n",
    "| Mok T et al (2009)  | 533          | 609            | 586            | 608              |\n",
    "| Sequist L et al (2013)| 153        | 230            | 104            | 115              |\n",
    "\n",
    "We will use this data for the meta-analysis.\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "82f68922",
   "metadata": {},
   "source": [
    "\n",
    "## DerSimonian and Laird Method\n",
    "\n",
    "This method calculates the pooled risk ratio using the inverse variance weighting approach.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa84e253",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Define the dataset\n",
    "data = pd.DataFrame({\n",
    "    \"Study\": [\"Shi Y et al (2017)\", \"Fukuoka M et al (2011)\", \"Mok T et al (2009)\", \"Sequist L et al (2013)\"],\n",
    "    \"Events_TKI\": [91, 76, 533, 153],\n",
    "    \"Subjects_TKI\": [138, 96, 609, 230],\n",
    "    \"Events_Chemo\": [106, 90, 586, 104],\n",
    "    \"Subjects_Chemo\": [122, 94, 608, 115]\n",
    "})\n",
    "\n",
    "# Calculate risk ratios and variances\n",
    "data[\"Risk_Ratio\"] = (data[\"Events_TKI\"] / data[\"Subjects_TKI\"]) / (data[\"Events_Chemo\"] / data[\"Subjects_Chemo\"])\n",
    "data[\"Variance\"] = (1 / data[\"Events_TKI\"]) + (1 / data[\"Subjects_TKI\"]) + (1 / data[\"Events_Chemo\"]) + (1 / data[\"Subjects_Chemo\"])\n",
    "\n",
    "# Calculate weights\n",
    "data[\"Weight\"] = 1 / data[\"Variance\"]\n",
    "\n",
    "# Pooled Risk Ratio\n",
    "pooled_rr = np.sum(data[\"Risk_Ratio\"] * data[\"Weight\"]) / np.sum(data[\"Weight\"])\n",
    "pooled_variance = 1 / np.sum(data[\"Weight\"])\n",
    "\n",
    "print(f\"Pooled Risk Ratio: {pooled_rr:.2f}\")\n",
    "print(f\"Pooled Variance: {pooled_variance:.2f}\")\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bbb62319",
   "metadata": {},
   "source": [
    "\n",
    "## Forest Plot\n",
    "\n",
    "The forest plot visualizes the individual study risk ratios and the pooled estimate.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "828b7d95",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Forest plot\n",
    "plt.errorbar(data[\"Risk_Ratio\"], range(len(data)), xerr=np.sqrt(data[\"Variance\"]), fmt=\"o\", label=\"Studies\")\n",
    "plt.axvline(pooled_rr, color=\"red\", linestyle=\"--\", label=\"Pooled Risk Ratio\")\n",
    "plt.yticks(range(len(data)), data[\"Study\"])\n",
    "plt.xlabel(\"Risk Ratio\")\n",
    "plt.title(\"Forest Plot\")\n",
    "plt.legend()\n",
    "plt.gca().invert_yaxis()  # Reverse y-axis for readability\n",
    "plt.show()\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f5089f94",
   "metadata": {},
   "source": [
    "\n",
    "## Funnel Plot\n",
    "\n",
    "The funnel plot evaluates potential publication bias by plotting precision against effect size.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a9696b1f",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# Calculate standard errors\n",
    "data[\"StdError\"] = np.sqrt(data[\"Variance\"])\n",
    "\n",
    "# Funnel plot\n",
    "plt.scatter(data[\"Risk_Ratio\"], 1 / data[\"StdError\"], alpha=0.7)\n",
    "plt.axvline(pooled_rr, color=\"red\", linestyle=\"--\", label=\"Pooled Risk Ratio\")\n",
    "plt.xlabel(\"Risk Ratio\")\n",
    "plt.ylabel(\"Precision (1 / StdError)\")\n",
    "plt.title(\"Funnel Plot\")\n",
    "plt.legend()\n",
    "plt.show()\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d2993a0",
   "metadata": {},
   "source": [
    "\n",
    "## Mantel-Haenszel Method\n",
    "\n",
    "This method is more appropriate for meta-analysis with fewer studies (e.g., <10).\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c4f72bd0",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# Mantel-Haenszel weight calculation\n",
    "data[\"MH_Weight\"] = 1 / (data[\"Events_TKI\"] + data[\"Events_Chemo\"])\n",
    "\n",
    "# Mantel-Haenszel pooled Risk Ratio\n",
    "mh_rr = np.sum(data[\"Risk_Ratio\"] * data[\"MH_Weight\"]) / np.sum(data[\"MH_Weight\"])\n",
    "mh_variance = 1 / np.sum(data[\"MH_Weight\"])\n",
    "\n",
    "print(f\"Mantel-Haenszel Risk Ratio: {mh_rr:.2f}\")\n",
    "print(f\"Mantel-Haenszel Variance: {mh_variance:.2f}\")\n",
    "        "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
